


SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Choi et al.

(ii) TITLE OF INVENTION: Streptococcus pneumoniae Antigens and 
Vaccines 

(iii) NUMBER OF SEQUENCES: 452 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: USA 

(F) ZIP: 20850 



(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 

(B) COMPUTER: HP Vectra 486/33 

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 



(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: Unassigned

(B) FILING DATE: Herewith

(C) CLASSIFICATION: 



(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/961,083

(B) FILING DATE: OCT-30-1997



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Michelle S. Marks

(B) REGISTRATION NUMBER: 41,971

(C) REFERENCE/DOCKET NUMBER: PB340P2C1



(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (301) 309-8504 

(B) TELEFAX: (301) 309-8512 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1999 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TAAAATCTAC GACAATAAAA ATCAACTCAT TGCTGACTTG GGTTCTGAAC GCCGCGTCAA 60 

TGCCCAAGCT AATGATATTC CCACAGATTT GGTTAAGGCA ATCGTTTCTA TCGAAGACCA 120 

TCGCTTCTTC GACCACAGGG GGATTGATAC CATCCGTATC CTGGGAGCTT TCTTGCGCAA 180

TCTGCAAAGC AATTCCCTCC AAGGTGGATC AACTCTCACC CAACAGTTGA TTAAGTTGAC 240

TTACTTTTCA ACTTCGACTT CCGACCAGAC TATTTCTCGT AAGGCTCAGG AAGCTTGGTT 300

AGCGATTCAG TTAGAACAAA AAGCAACCAA GCAAGAAATC TTGACCTACT ATATAAATAA 360

GGTCTACATG TCTAATGGGA ACTATGGAAT GCAGACAGCA GCTCAAAACT ACTATGGTAA 420

AGACCTCAAT AATTTAAGTT TACCTCAGTT AGCCTTGCTG GCTGGAATGC CTCAGGCACC 480 

AAACCAATAT GACCCCTATT CACATCCAGA AGCAGCCCAA GACCGCCGAA ACTTGGTCTT 540 

ATCTGAAATG AAAAATCAAG GCTACATCTC TGCTGAACAG TATGAGAAAG CAGTCAATAC 600 

ACCAATTACT GATGGACTAC AAAGTCTCAA ATCAGCAAGT AATTACCCTG CTTACATGGA 660 

TAATTACCTC AAGGAAGTCA TCAATCAAGT TGAAGAAGAA ACAGGCTATA ACCTACTCAC 720

AACTGGGATG GATGTCTACA CAAATGTAGA CCAAGAAGCT CAAAAACATC TGTGGGATAT 780 

TTACAATACA GACGAATACG TTGCCTATCC AGACGATGAA TTGCAAGTCG CTTCTACCAT 840 

TGTTGATGTT TCTAACGGTA AAGTCATTGC CCAGCTAGGA GCACGCCATC AGTCAAGTAA 900 

TGTTTCCTTC GGAATTAACC AAGCAGTAGA AACAAACCGC GACTGGGGAT CAACTATGAA 960 

ACCGATCACA GACTATGCTC CTGCCTTGGA GTACGGTGTC TACGATTCAA CTGCTACTAT 1020

CGTTCACGAT GAGCCCTATA ACTACCCTGG GACAAATACT CCTGTTTATA ACTGGGATAG 1080 

GGGCTACTTT GGCAACATCA CCTTGCAATA CGCCCTGCAA CAATCGCGAA ACGTCCCAGC 1140 

CGTGGAAACT CTAAACAAGG TCGGACTCAA CCGCGCCAAG ACTTTCCTAA ATGGTCTAGG 1200 

AATCGACTAC CCAAGTATTC ACTACTCAAA TGCCATTTCA AGTAACACAA CCGAATCAGA 1260 

CAAAAAATAT GGAGCAAGTA GTGAAAAGAT GGCTGCTGCT TACGCTGCCT TTGCAAATGG 1320

TGGAACTTAC TATAAACCAA TGTATATCCA TAAAGTCGTC TTTAGTGATG GGAGTGAAAA 1380 

AGAGTTCTCT AATGTCGGAA CTCGTGCCAT GAAGGAAACG ACAGCCTATA TGATGACCGA 1440 

CATGATGAAA ACAGTCTTGA CTTATGGAAC TGGACGAAAT GCCTATCTTG CTTGGCTCCC 1500 

TCAGGCTGGT AAAACAGGAA CCTCTAACTA TACAGACGAG GAAATTGAAA ACCACATCAA 1560

GACCTCTCAA TTTGTAGCAC CTGATGAACT ATTTGCTGGC TATACGCGTA AATATTCAAT 1620

GGCTGTATGG ACAGGCTATT CTAACCGTCT GACACGACTT GTAGGCAATG GCCTTACGGT 1680

CGCTGCCAAA GTTTACCGCT CTATGATGAC CTACCTGTCT GAAGGAAGCA ATCCAGAAGA 1740

TTGGAATATA CCAGAGGGGC TCTACAGAAA TGGAGAATTC GTATTTAAAA ATGGTGCTCG 1800

TTCTACGTGG AACTCACCTG CTCCACAACA ACCCCCATCA ACTGAAAGTT CAAGCTCATC 1860

ATCAGATAGT TCAACTTCAC AGTCTAGCTC AACCACTCCA AGCACAAATA ATAGTACGAC 1920

TACCAATCCT AACAATAATA CGCAACAATC AAATACAACC CCTGATCAAC AAAATCAGAA 1980

TCCTCAACCA GCACAACCA 1999 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 666 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Lys Ile Tyr Asp Asn Lys Asn Gln Leu Ile Ala Asp Leu Gly Ser Glu
1 5 10 15

Arg Arg Val Asn Ala Gln Ala Asn Asp Ile Pro Thr Asp Leu Val Lys
20 25 30

Ala Ile Val Ser Ile Glu Asp His Arg Phe Phe Asp His Arg Gly Ile
35 40 45

Asp Thr Ile Arg Ile Leu Gly Ala Phe Leu Arg Asn Leu Gln Ser Asn
50 55 60

Ser Leu Gln Gly Gly Ser Thr Leu Thr Gln Gln Leu Ile Lys Leu Thr
65 70 75 80

Tyr Phe Ser Thr Ser Thr Ser Asp Gln Thr Ile Ser Arg Lys Ala Gln
85 90 95

Glu Ala Trp Leu Ala Ile Gln Leu Glu Gln Lys Ala Thr Lys Gln Glu
100 105 110

Ile Leu Thr Tyr Tyr Ile Asn Lys Val Tyr Met Ser Asn Gly Asn Tyr
115 120 125

Gly Met Gln Thr Ala Ala Gln Asn Tyr Tyr Gly Lys Asp Leu Asn Asn
130 135 140

Leu Ser Leu Pro Gln Leu Ala Leu Leu Ala Gly Met Pro Gln Ala Pro
145 150 155 160

Asn Gln Tyr Asp Pro Tyr Ser His Pro Glu Ala Ala Gln Asp Arg Arg
165 170 175

Asn Leu Val Leu Ser Glu Met Lys Asn Gln Gly Tyr Ile Ser Ala Glu
180 185 190

Gln Tyr Glu Lys Ala Val Asn Thr Pro Ile Thr Asp Gly Leu Gln Ser
195 200 205

Leu Lys Ser Ala Ser Asn Tyr Pro Ala Tyr Met Asp Asn Tyr Leu Lys
210 215 220



Glu Val Ile Asn Gln Val Glu Glu Glu Thr Gly Tyr Asn Leu Leu Thr
225 230 235 240

Thr Gly Met Asp Val Tyr Thr Asn Val Asp Gln Glu Ala Gln Lys His
245 250 255

Leu Trp Asp Ile Tyr Asn Thr Asp Glu Tyr Val Ala Tyr Pro Asp Asp
260 265 270

Glu Leu Gln Val Ala Ser Thr Ile Val Asp Val Ser Asn Gly Lys Val
275 280 285

Ile Ala Gln Leu Gly Ala Arg His Gln Ser Ser Asn Val Ser Phe Gly
290 295 300

Ile Asn Gln Ala Val Glu Thr Asn Arg Asp Trp Gly Ser Thr Met Lys
305 310 315 320

Pro Ile Thr Asp Tyr Ala Pro Ala Leu Glu Tyr Gly Val Tyr Asp Ser
325 330 335

Thr Ala Thr Ile Val His Asp Glu Pro Tyr Asn Tyr Pro Gly Thr Asn
340 345 350

Thr Pro Val Tyr Asn Trp Asp Arg Gly Tyr Phe Gly Asn Ile Thr Leu
355 360 365

Gln Tyr Ala Leu Gln Gln Ser Arg Asn Val Pro Ala Val Glu Thr Leu
370 375 380

Asn Lys Val Gly Leu Asn Arg Ala Lys Thr Phe Leu Asn Gly Leu Gly
385 390 395 400

Ile Asp Tyr Pro Ser Ile His Tyr Ser Asn Ala Ile Ser Ser Asn Thr
405 410 415

Thr Glu Ser Asp Lys Lys Tyr Gly Ala Ser Ser Glu Lys Met Ala Ala
420 425 430

Ala Tyr Ala Ala Phe Ala Asn Gly Gly Thr Tyr Tyr Lys Pro Met Tyr
435 440 445

Ile His Lys Val Val Phe Ser Asp Gly Ser Glu Lys Glu Phe Ser Asn
450 455 460

Val Gly Thr Arg Ala Met Lys Glu Thr Thr Ala Tyr Met Met Thr Asp
465 470 475 480

Met Met Lys Thr Val Leu Thr Tyr Gly Thr Gly Arg Asn Ala Tyr Leu
485 490 495

Ala Trp Leu Pro Gln Ala Gly Lys Thr Gly Thr Ser Asn Tyr Thr Asp
500 505 510

Glu Glu Ile Glu Asn His Ile Lys Thr Ser Gln Phe Val Ala Pro Asp
515 520 525

Glu Leu Phe Ala Gly Tyr Thr Arg Lys Tyr Ser Met Ala Val Trp Thr
530 535 540

Gly Tyr Ser Asn Arg Leu Thr Pro Leu Val Gly Asn Gly Leu Thr Val
545 550 555 560

Ala Ala Lys Val Tyr Arg Ser Met Met Thr Tyr Leu Ser Glu Gly Ser
565 570 575

Asn Pro Glu Asp Trp Asn Ile Pro Glu Gly Leu Tyr Arg Asn Gly Glu
580 585 590

Phe Val Phe Lys Asn Gly Ala Arg Ser Thr Trp Asn Ser Pro Ala Pro
595 600 605

Gln Gln Pro Pro Ser Thr Glu Ser Ser Ser Ser Ser Ser Asp Ser Ser
610 615 620

Thr Ser Gln Ser Ser Ser Thr Thr Pro Ser Thr Asn Asn Ser Thr Thr
625 630 635 640

Thr Asn Pro Asn Asn Asn Thr Gln Gln Ser Asn Thr Thr Pro Asp Gln
645 650 655

Gln Asn Gln Asn Pro Gln Pro Ala Gln Pro
660 665

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1714 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AAATTACAAT ACGGACTATG AATTGACCTC TGGAGAAAAA TTACCTCTTC CTAAAGAGAT 60 

TTCAGGTTAC ACTTATATTG GATATATCAA AGAGGGAAAA ACGACTTCTG AGTCTGAAGT 120 

AAGTAATCAA AAGAGTTCAG TTGCCACTCC TACAAAACAA CAAAAGGTGG ATTATAATGT 180

TACACCGAAT TTTGTAGACC ATCCATCAAC AGTACAAGCT ATTCAGGAAC AAACACCTGT 240

TTCTTCAACT AAGCCGACAG AAGTTCAAGT AGTTGAAAAA CCTTTCTCTA CTGAATTAAT 300 

CAATCCAAGA AAAGAAGAGA AACAATCTTC AGATTCTCAA GAACAATTAG CCGAACATAA 360 

GAATCTAGAA ACGAAGAAAG AGGAGAAGAT TTCTCCAAAA GAAAAGACTG GGGTAAATAC 420 

ATTAAATCCA CAGGATGAAG TTTTATCAGG TCAATTGAAC AAACCTGAAC TCTTATATCG 480

TGAGGAAACT ATGGAGACAA AAATAGATTT TCAAGAAGAA ATTCAAGAAA ATCCTGATTT 540 

AGCTGAAGGA ACTGTAAGAG TAAAACAAGA AGGTAAATTA GGTAAGAAAG TTGAAATCGT 600

CAGAATATTC TCTGTAAACA AGGAAGAAGT TTCGCGAGAA ATTGTTTCAA CTTCAACGAC 660

TGCGCCTAGT CCAAGAATAG TCGAAAAAGG TACTAAAAAA ACTCAAGTTA TAAAGGAACA 720

ACCTGAGACT GGTGTAGAAC ATAAGGACGT ACAGTCTGGA GCTATTGTTG AACCCGCAAT 780

TCAGCCTGAG TTGCCCGAAG CTGTAGTAAG TGACAAAGGC GAACCAGAAG TTCAACCTAC 840

ATTACCCGAA GCAGTTGTGA CCGACAAAGG TGAGACTGAG GTTCAACCAG AGTCGCCAGA 900

TACTGTGGTA AGTGATAAAG GTGAACCAGA GCAGGTAGCA CCGCTTCCAG AATATAAGGG 960 

TAATATTGAG CAAGTAAAAC CTGAAACTCC GGTTGAGAAG ACCAAAGAAC AAGGTCCAGA 1020

AAAAACTGAA GAAGTTCCAG TAAAACCAAC AGAAGAAACA CCAGTAAATC CAAATGAAGG 1080 

TACTACAGAA GGAACCTCAA TTCAAGAAGC AGAAAATCCA GTTCAACCTG CAGAAGAATC 1140 

AACAACGAAT TCAGAGAAAG TATCACCAGA TACATCTAGC AAAAATACTG GGGAAGTGTC 1200 

CAGTAATCCT AGTGATTCGA CAACCTCAGT TGGAGAATCA AATAAACCAG AACATAATGA 1260 

CTCTAAAAAT GAAAATTCAG AAAAAACTGT AGAAGAAGTT CCAGTAAATC CAAATGAAGG 1320 

CACAGTAGAA GGTACCTCAA ATCAAGAAAC AGAAAAACCA GTTCAACCTG CAGAAGAAAC 1380

ACAAACAAAC TCTGGGAAAA TAGCTAACGA AAATACTGGA GAAGTATCCA ATAAACCTAG 1440

TGATTCAAAA CCACCAGTTG AAGAATCAAA TCAACCAGAA AAAAACGGAA CTGCAACAAA 1500

ACCAGAAAAT TCAGGTAATA CAACATCAGA GAATGGACAA ACAGAACCAG AACCATCAAA 1560

CGGAAATTCA ACTGAGGATG TTTCAACCGA ATCAAACACA TCCAATTCAA ATGGAAACGA 1620 

AGAAATTAAA CAAGAAAATG AACTAGACCC TGATAAAAAG GTAGAAGAAC CAGAGAAAAC 1680 

ACTTGAATTA AGAAATGTTT CCGACCTAGA GTTA 1714

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 571 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Asn Tyr Asn Thr Asp Tyr Glu Leu Thr Ser Gly Glu Lys Leu Pro Leu
1 5 10 15

Pro Lys Glu Ile Ser Gly Tyr Thr Tyr Ile Gly Tyr Ile Lys Glu Gly
20 25 30

Lys Thr Thr Ser Glu Ser Glu Val Ser Asn Gln Lys Ser Ser Val Ala
35 40 45

Thr Pro Thr Lys Gln Gln Lys Val Asp Tyr Asn Val Thr Pro Asn Phe
50 55 60

Val Asp His Pro Ser Thr Val Gln Ala Ile Gln Glu Gln Thr Pro Val
65 70 75 80

Ser Ser Thr Lys Pro Thr Glu Val Gln Val Val Glu Lys Pro Phe Ser
85 90 95

Thr Glu Leu Ile Asn Pro Arg Lys Glu Glu Lys Gln Ser Ser Asp Ser
100 105 110

Gln Glu Gln Leu Ala Glu His Lys Asn Leu Glu Thr Lys Lys Glu Glu
115 120 125

Lys Ile Ser Pro Lys Glu Lys Thr Gly Val Asn Thr Leu Asn Pro Gln
130 135 140

Asp Glu Val Leu Ser Gly Gln Leu Asn Lys Pro Glu Leu Leu Tyr Arg
145 150 155 160

Glu Glu Thr Met Glu Thr Lys Ile Asp Phe Gln Glu Glu Ile Gln Glu
165 170 175

Asn Pro Asp Leu Ala Glu Gly Thr Val Arg Val Lys Gln Glu Gly Lys
180 185 190

Leu Gly Lys Lys Val Glu Ile Val Arg Ile Phe Ser Val Asn Lys Glu
195 200 205

Glu Val Ser Arg Glu Ile Val Ser Thr Ser Thr Thr Ala Pro Ser Pro
210 215 220

Arg Ile Val Glu Lys Gly Thr Lys Lys Thr Gln Val Ile Lys Glu Gln
225 230 235 240

Pro Glu Thr Gly Val Glu His Lys Asp Val Gln Ser Gly Ala Ile Val
245 250 255

Glu Pro Ala Ile Gln Pro Glu Leu Pro Glu Ala Val Val Ser Asp Lys
260 265 270

Gly Glu Pro Glu Val Gln Pro Thr Leu Pro Glu Ala Val Val Thr Asp
275 280 285

Lys Gly Glu Thr Glu Val Gln Pro Glu Ser Pro Asp Thr Val Val Ser
290 295 300

Asp Lys Gly Glu Pro Glu Gln Val Ala Pro Leu Pro Glu Tyr Lys Gly
305 310 315 320

Asn Ile Glu Gln Val Lys Pro Glu Thr Pro Val Glu Lys Thr Lys Glu
325 330 335

Gln Gly Pro Glu Lys Thr Glu Glu Val Pro Val Lys Pro Thr Glu Glu
340 345 350

Thr Pro Val Asn Pro Asn Glu Gly Thr Thr Glu Gly Thr Ser Ile Gln
355 360 365

Glu Ala Glu Asn Pro Val Gln Pro Ala Glu Glu Ser Thr Thr Asn Ser
370 375 380

Glu Lys Val Ser Pro Asp Thr Ser Ser Lys Asn Thr Gly Glu Val Ser
385 390 395 400

Ser Asn Pro Ser Asp Ser Thr Thr Ser Val Gly Glu Ser Asn Lys Pro
405 410 415

Glu His Asn Asp Ser Lys Asn Glu Asn Ser Glu Lys Thr Val Glu Glu
420 425 430



Val Pro Val Asn Pro Asn Glu Gly Thr Val Glu Gly Thr Ser Asn Gln
435 440 445

Glu Thr Glu Lys Pro Val Gln Pro Ala Glu Glu Thr Gln Thr Asn Ser
450 455 460

Gly Lys Ile Ala Asn Glu Asn Thr Gly Glu Val Ser Asn Lys Pro Ser
465 470 475 480

Asp Ser Lys Pro Pro Val Glu Glu Ser Asn Gln Pro Glu Lys Asn Gly
485 490 495

Thr Ala Thr Lys Pro Glu Asn Ser Gly Asn Thr Thr Ser Glu Asn Gly
500 505 510

Gln Thr Glu Pro Glu Pro Ser Asn Gly Asn Ser Thr Glu Asp Val Ser
515 520 525

Thr Glu Ser Asn Thr Ser Asn Ser Asn Gly Asn Glu Glu Ile Lys Gln
530 535 540

Glu Asn Glu Leu Asp Pro Asp Lys Lys Val Glu Glu Pro Glu Lys Thr
545 550 555 560

Leu Glu Leu Arg Asn Val Ser Asp Leu Glu Leu
565 570

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 748 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TGAGAATCAA GCTACACCCA AAGAGACTAG CGCTCAAAAG ACAATCGTCC TTGCTACAGC 60 

TGGCGACGTG CCACCATTTG ACTACGAAGA CAAGGGCAAT CTGACAGGCT TTGATATCGA 120 

AGTTTTAAAG GCAGTAGATG AAAAACTCAG CGACTACGAG ATTCAATTCC AAAGAACCGC 180 

CTGGGAGAGC ATCTTCCCAG GACTTGATTC TGGTCACTAT CAGGCTGCGG CCAATAACTT 240

GAGTTACACA AAAGAGCGTG CTGAAAAATA CCTTTACTCG CTTCCAATTT CCAACAATCC 300

CCTCGTCCTT GTCAGCAACA AGAAAAATCC TTTGACTTCT CTTGACCAGA TCGCTGGTAA 360

AACAACACAA GAGGATACCG GAACTTCTAA CGCTCAATTC ATCAATAACT GGAATCAGAA 420 

ACACACTGAT AATCCCGCTA CAATTAATTT TTCTGGTGAG GATATTGGTA AACGAATCCT 480 

AGACCTTGCT AACGGAGAGT TTGATTTCCT AGTTTTTGAC AAGGTATCCG TTCAAAAGAT 540 

TATCAAGGAC CGTGGTTTAG ACCTCTCAGT CGTTGATTTA CCTTCTGCAG ATAGCCCCAG 600

CAATTATATC ATTTTCTCAA GCGACCAAAA AGAGTTTAAA GAGCAATTTG ATAAAGCGCT 660

CAAAGAACTC TATCAAGACG GAACCCTTGA AAAACTCAGC AATACCTATC TAGGTGGTTC 720 

TTACCTCCCA GATCAATCTC AGTTACAA 748

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 249 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Glu Asn Gln Ala Thr Pro Lys Glu Thr Ser Ala Gln Lys Thr Ile Val
1 5 10 15

Leu Ala Thr Ala Gly Asp Val Pro Pro Phe Asp Tyr Glu Asp Lys Gly
20 25 30

Asn Leu Thr Gly Phe Asp Ile Glu Val Leu Lys Ala Val Asp Glu Lys
35 40 45

Leu Ser Asp Tyr Glu Ile Gln Phe Gln Arg Thr Ala Trp Glu Ser Ile
50 55 60

Phe Pro Gly Leu Asp Ser Gly His Tyr Gln Ala Ala Ala Asn Asn Leu
65 70 75 80

Ser Tyr Thr Lys Glu Arg Ala Glu Lys Tyr Leu Tyr Ser Leu Pro Ile
85 90 95

Ser Asn Asn Pro Leu Val Leu Val Ser Asn Lys Lys Asn Pro Leu Thr
100 105 110

Ser Leu Asp Gln Ile Ala Gly Lys Thr Thr Gln Glu Asp Thr Gly Thr
115 120 125

Ser Asn Ala Gln Phe Ile Asn Asn Trp Asn Gln Lys His Thr Asp Asn
130 135 140

Pro Ala Thr Ile Asn Phe Ser Gly Glu Asp Ile Gly Lys Arg Ile Leu
145 150 155 160

Asp Leu Ala Asn Gly Glu Phe Asp Phe Leu Val Phe Asp Lys Val Ser
165 170 175

Val Gln Lys Ile Ile Lys Asp Arg Gly Leu Asp Leu Ser Val Val Asp
180 185 190

Leu Pro Ser Ala Asp Ser Pro Ser Asn Tyr Ile Ile Phe Ser Ser Asp
195 200 205

Gln Lys Glu Phe Lys Glu Gln Phe Asp Lys Ala Leu Lys Glu Leu Tyr
210 215 220

Gln Asp Gly Thr Leu Glu Lys Leu Ser Asn Thr Tyr Leu Gly Gly Ser
225 230 235 240

Tyr Leu Pro Asp Gln Ser Gln Leu Gln
245



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 985 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TGGTAACCGC TCTTCTCGTA ACGCAGCTTC ATCTTCTGAT GTGAAGACAA AAGCAGCAAT 60

CGTCACTGAT ACTGGTGGTG TTGATGACAA ATCATTCAAC CAATCAGCTT GGGAAGGTTT 120 

GCAGGCTTGG GGTAAAGAAC ACAATCTTTC AAAAGATAAC GGTTTCACTT ACTTCCAATC 180 

AACAAGTGAA GCTGACTACG CTAACAACTT GCAACAAGCG GCTGGAAGTT ACAACCTAAT 240 

CTTCGGTGTT GGTTTTGCCC TTAATAATGC AGTTAAAGAT GCAGCAAAAG AACACACTGA 300

CTTGAACTAT GTCTTGATTG ATGATGTGAT TAAAGACCAA AAGAATGTTG CGAGCGTAAC 360

TTTCGCTGAT AATGAGTCAG GTTACCTTGC AGGTGTGGCT GCAGCAAAAA CAACTAAGAC 420

AAAACAAGTT GGTTTTGTAG GTGGTATCGA ATCTGAAGTT ATCTCTCGTT TTGAAGCAGG 480

ATTCAAGGCT GGTGTTGCGT CAGTAGACCC ATCTATCAAA GTCCAAGTTG ACTACGCTGG 540 

TTCATTTGGT GATGCGGCTA AAGGTAAAAC AATTGCAGCC GCACAATACG CAGCCGGTGC 600 

AGATATTGTT TACCAAGTAG CTGGTGGTAC AGGTGCAGGT GTCTTTGCAG AGGCAAAATC 660 

TCTCAACGAA AGCCGTCCTG AAAATGAAAA AGTTTGGGTT ATCGGTGTTG ATCGTGACCA 720 

AGAAGCAGAA GGTAAATACA CTTCTAAAGA TGGCAAAGAA TCAAACTTTG TTCTTGTATC 780

TACTTTGAAA CAAGTTGGTA CAACTGTAAA AGATATTTCT AACAAGGCAG AAAGAGGAGA 840

ATTCCCTGGC GGTCAAGTGA TCGTTTACTC ATTGAAGGAT AAAGGGGTTG ACTTGGCAGT 900 

AACAAACCTT TCAGAAGAAG GTAAAAAAGC TGTCGAAGAT GCAAAAGCTA AAATCCTTGA 960 

TGGAAGCGTA AAAGTTCCTG AAAAA 985 



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 328 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gly Asn Arg Ser Ser Arg Asn Ala Ala Ser Ser Ser Asp Val Lys Thr
1 5 10 15

Lys Ala Ala Ile Val Thr Asp Thr Gly Gly Val Asp Asp Lys Ser Phe
20 25 30

Asn Gln Ser Ala Trp Glu Gly Leu Gln Ala Trp Gly Lys Glu His Asn
35 40 45

Leu Ser Lys Asp Asn Gly Phe Thr Tyr Phe Gln Ser Thr Ser Glu Ala
50 55 60

Asp Tyr Ala Asn Asn Leu Gln Gln Ala Ala Gly Ser Tyr Asn Leu Ile
65 70 75 80

Phe Gly Val Gly Phe Ala Leu Asn Asn Ala Val Lys Asp Ala Ala Lys
85 90 95

Glu His Thr Asp Leu Asn Tyr Val Leu Ile Asp Asp Val Ile Lys Asp
100 105 110

Gln Lys Asn Val Ala Ser Val Thr Phe Ala Asp Asn Glu Ser Gly Tyr
115 120 125

Leu Ala Gly Val Ala Ala Ala Lys Thr Thr Lys Thr Lys Gln Val Gly
130 135 140

Phe Val Gly Gly Ile Glu Ser Glu Val Ile Ser Arg Phe Glu Ala Gly
145 150 155 160

Phe Lys Ala Gly Val Ala Ser Val Asp Pro Ser Ile Lys Val Gln Val
165 170 175

Asp Tyr Ala Gly Ser Phe Gly Asp Ala Ala Lys Gly Lys Thr Ile Ala
180 185 190

Ala Ala Gln Tyr Ala Ala Gly Ala Asp Ile Val Tyr Gln Val Ala Gly
195 200 205

Gly Thr Gly Ala Gly Val Phe Ala Glu Ala Lys Ser Leu Asn Glu Ser
210 215 220

Arg Pro Glu Asn Glu Lys Val Trp Val Ile Gly Val Asp Arg Asp Gln
225 230 235 240

Glu Ala Glu Gly Lys Tyr Thr Ser Lys Asp Gly Lys Glu Ser Asn Phe
245 250 255

Val Leu Val Ser Thr Leu Lys Gln Val Gly Thr Thr Val Lys Asp Ile
260 265 270

Ser Asn Lys Ala Glu Arg Gly Glu Phe Pro Gly Gly Gln Val Ile Val
275 280 285

Tyr Ser Leu Lys Asp Lys Gly Val Asp Leu Ala Val Thr Asn Leu Ser
290 295 300

Glu Glu Gly Lys Lys Ala Val Glu Asp Ala Lys Ala Lys Ile Leu Asp
305 310 315 320

Gly Ser Val Lys Val Pro Glu Lys
325

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1404 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TGTGGAAATT TGACAGGTAA CAGCAAAAAA GCTGCTGATT CAGGTGACAA ACCTGTTATC 60 

AAAATGTACC AAATCGGTGA CAAACCAGAC AACTTGGATG AATTGTTAGC AAATGCCAAC 120

AAAATCATTG AAGAAAAAGT TGGTGCCAAA TTGGATATCC AATACCTTGG CTGGGGTGAC 180 

TATGGTAAGA AAATGTCAGT TATCACATCA TCTGGTGAAA ACTATGATAT TGCCTTTGCA 240

GATAACTATA TTGTAAATGC TCAAAAAGGT GCTTACGCTG ACTTGACAGA ATTGTACAAA 300

AAAGAAGGTA AAGACCTTTA CAAAGCACTT GACCCAGCTT ACATCAAGGG TAATACTGTA 360

AATGGTAAGA TTTACGCTGT TCCAGTTGCA GCCAACGTTG CATCATCTCA AAACTTTGCC 420

TTCAACGGAA CTCTCCTTGC TAAATATGGT ATCGATATTT CAGGTGTTAC TTCTTACGAA 480

ACTCTTGAGC CAGTCTTGAA ACAAATCAAA GAAAAAGCTC CAGACGTAGT ACCATTTGCT 540

ATTGGTAAAG TTTTCATCCC ATCTGATAAT TTTGACTACC CAGTAGCAAA CGGTCTTCCA 600

TTCGTTATCG ACCTTGAAGG CGATACTACT AAAGTTGTAA ACCGTTACGA AGTGCCTCGT 660

TTCAAAGAAC ACTTGAAGAC TCTTCACAAA TTCTATGAAG CTGGCTACAT TCCAAAAGAC 720

GTCGCAACAA GCGATACTTC CTTTGACCTT CAACAAGATA CTTGGTTCGT TCGTGAAGAA 780 

ACAGTAGGAC CAGCTGACTA CGGTAACAGC TTGCTTTCAC GTGTTGCCAA CAAAGATATC 840 

CAAATCAAAC CAATTACTAA CTTCATCAAG NAAAACCAAA CAACACAAGT TGCTAACTTT 900 

GTCATCTCAA ACAACTCTAA GAACAAAGAA AAATCAATGG AAATCTTGAA CCTCTTGAAT 960 

ACGAACCCAG AACTCTTGAA CGGTCTTGTT TACGGTCCAG AAGGCAAGAA CTGGGAAAAA 1020

ATTGAAGGTA AAGAAAACCG TGTTCGCGTT CTTGATGGCT ACAAAGGAAA CACTCACATG 1080 

GGTGGATGGA ACACTGGTAA CAACTGGATC CTTTACATCA ACGAAAACGT TACAGACCAA 1140 

CAAATCGAAA ATTCTAAGAA AGAATTGGCA GAAGCTAAAG AATCTCCAGC GCTTGGATTT 1200 

ATCTTCAATA CTGACAATGT GAAATCTGAA ATCTCAGCTA TTGCTAACAC AATGCAACAA 1260

TTTGATACAG CTATCAACAC TGGTACTGTA GACCCAGATA AAGCGATTCC AGAATTGATG 1320

GAAAAATTGA AATCTGAAGG TGCCTACGAA AAAGTATTGA ACGAAATGCA AAAACAATAC 1380

GATGAATTCT TGAAAAACAA AAAA 1404 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



